Tests are often flaky because the resolution of the OSX filesystem timestamps
appears to be 1s, so we often have to manually sleep for 1s to ensure that
enough time has passed to guarantee that the timestamp is different (used in
freshness calculations).